Before we combine actions and probabilities two very obvious questions should be asked. Firstly, 
what does "the probability of an action" mean? Secondly, how does probability interact with nonde- 
terminism? Neither question has a single universally agreed upon answer but by considering these 
questions at the outset we build a novel and hopefully intuitive probabilistic event-based formalism. 

In previous work we have characterised refinement via the notion of testing. Basically, if one 
system passes all the tests that another system passes (and maybe more) we say the first system is a 
refinement of the second. This is, in our view, an important way of characterising refinement, via the 
question "what sort of refinement should I be using?" 

We use testing in this paper as the basis for our refinement. We develop tests for probabilistic 
systems by analogy with the tests developed for non-probabilistic systems. We make sure that our 
probabilistic tests, when performed on non-probabilistic automata, give us refinement relations which 
agree with those already established for non-probabilistic automata. We formalise this property as a vertical refinement. 

1 Introduction 

Event-based models are frequently based on finite automata (FA, also called labelled transition systems) 
and probabilistic event-based systems are frequently based on FA where the transitions are also labelled 
by a probability as well as by an action. Before we combine events and probabilities two very obvious 
questions then arise. Firstly, what does "the probability of an event" mean, or what does it mean for an 
event to "behave in a probabilistic fashion"? Secondly, how does probability interact with nondetermin- 
ism? Neither question has a single universally agreed upon answer but by considering these questions at 
the outset we build a novel and hopefully intuitive probabilistic event-based formalism. 

Throughout we will be motivated by a wish to, in the end, develop a notion of refinement for proba- 
bilistic systems. In fact, refinement will be the starting point of our story here as well as the desired end 
point. 

In previous work we have characterised refinement via the notion of testing. Basically, if one system 
passes all the tests that another system passes (and maybe more) we say the first system is a refinement 
of the second. This is, in our view, an important way of characterising refinement since the question 
"what sort of refinement should I be using?" can be answered by saying "you should be using the sort 
of refinement that is characterised by the sort of tests which characterise the contexts within which your 
system will find itself, i.e. choose your refinement by looking at what contexts your systems will be used in".



Because this seems such a natural and useful answer, we use testing again in this paper as the basis 
for our refinement. We develop tests for probabilistic systems by analogy with the tests developed for 
non-probabilistic systems, all the while hoping to make sure that our probabilistic tests, when performed 
on non-probabilistic automata (and just noting whether a probability distribution is empty or not), give 
us refinement relations which agree with those already established for non-probabilistic automata: this gives us confidence 
that our new notions make sense. We formalise this property in Section 7. 

The real test (!) in all this comes when we consider probabilistic automata which also contain nonde- 
terminism. Again, we are guided by the wish that our probabilistic tests, when used on nondeterministic, 
non-probabilistic automata, give us a refinement ordering which agrees with that originally given for 
those automata when probability was not considered. We also find that the algebraic properties that 
characterise the non-probabilistic case carry over into our new domain. 

We formalise a notion of refinement based upon probabilistic tests and then try to (re-)capture what 
nondeterminism means in this probabilistic setting. 

We will first introduce transition systems as a semantic foundation for non-probabilistic automata 
and recap previous work on using testing to define refinement for such systems. 

It will turn out that part of the key to doing this for probabilistic systems is to be clear about two 
different philosophical bases for probability, so we next review those. Another part of the key to this 
work will be a consideration of how nondeterminism is characterised, so we will go on to discuss that 
subsequently. This will finally suggest how we might adapt transition systems to allow consideration of 
probability, and we finally show how this adaptation can be used to also allow a treatment of nondeter- 
ministic probabilistic systems, all the while retaining our testing-based notion of refinement. 

We also show (via a selection) that expected properties hold for our refinement. 



2 Transition systems 

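Definition 1 Finite Automata (FA). Let $Act$ be a set of actions and let $Act^\tau$ be the same set along with 
the special action $\tau$, which represents actions interacting to form events. Let $N_A$ be a finite set of nodes. 
The finite automaton $A$ is given by the triple $(N_A, S_A, T_A)$ where 

1. $S_A \subseteq N_A$ is a set of start nodes 

2. $T_A \subseteq \{(n,a,m) \mid n,m \in N_A \wedge a \in Act^\tau\}$ shows the effect of each action. 

We write $x \xrightarrow{a}_A y$ for $(x,a,y) \in T_A$ and $x \xrightarrow{a} y$ where $A$ is obvious from context. We write $n \xrightarrow{a}$ for 
$\exists m.(n,a,m) \in T_A$, and $m \xrightarrow{\rho} n$ for 

$\exists m_1 \ldots m_j.\ m \xrightarrow{\rho_1} m_1,\ m_1 \xrightarrow{\rho_2} m_2,\ \ldots,\ m_j \xrightarrow{\rho_{j+1}} n$ 

and $m \xrightarrow{\rho}$ for 

$\exists m_1 \ldots m_j, n.\ m \xrightarrow{\rho_1} m_1,\ m_1 \xrightarrow{\rho_2} m_2,\ \ldots,\ m_j \xrightarrow{\rho_{j+1}} n$ 

when $\rho = (\rho_1, \ldots, \rho_{j+1})$, a finite sequence of actions. 

We write $n \stackrel{\epsilon}{\Longrightarrow} m$ for $n \xrightarrow{\tau}^{*} m$, $n \stackrel{a}{\Longrightarrow} m$ for $\exists j,k.\ n \stackrel{\epsilon}{\Longrightarrow} j \wedge j \xrightarrow{a} k \wedge k \stackrel{\epsilon}{\Longrightarrow} m$ and $n \stackrel{a}{\Longrightarrow}$ for 
$\exists j,k,m.\ n \stackrel{\epsilon}{\Longrightarrow} j \wedge j \xrightarrow{a} k \wedge k \stackrel{\epsilon}{\Longrightarrow} m$. 

$m \stackrel{\rho}{\Longrightarrow}$ and $m \stackrel{\rho}{\Longrightarrow} n$ are defined similarly to the cases for $\longrightarrow$. 

Where $\rho$ is a sequence of actions over $Act^\tau$ we write $\rho_o$ for $\rho$ with the $\tau$s removed. 

The traces are $Tr(A) \stackrel{\text{def}}{=} \{\rho \mid s \in S_A \wedge s \stackrel{\rho}{\Longrightarrow}\}$. 

The complete traces are $Tr^c(A) \stackrel{\text{def}}{=} \{\rho \mid s \in S_A \wedge s \stackrel{\rho}{\Longrightarrow} n \wedge \pi(n) = \emptyset\}$ where $\pi(n) \stackrel{\text{def}}{=} \{m \mid n \Longrightarrow_A m\}$. 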
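To make Definition 1 concrete, here is a minimal Python sketch. It assumes an acyclic FA stored as a triple of sets and encodes $\tau$ as the string "tau"; both are illustrative choices of ours, not part of the formalism. It also assumes that a trace is complete when it ends in a node with no outgoing transition, which is one reading of the condition $\pi(n) = \emptyset$.

```python
TAU = "tau"  # assumed encoding of the special action tau

def traces(fa):
    """Return (Tr, Tr_c) for an acyclic FA given as (nodes, starts, transitions)."""
    _nodes, starts, trans = fa
    all_traces, complete = set(), set()

    def walk(node, observed):
        all_traces.add(observed)
        outgoing = [(a, m) for (n, a, m) in trans if n == node]
        if not outgoing:
            # Assumption: "complete" means no further transition is possible.
            complete.add(observed)
        for a, m in outgoing:
            # tau steps are dropped from the observed trace.
            walk(m, observed if a == TAU else observed + (a,))

    for s in starts:
        walk(s, ())
    return all_traces, complete

# A two-start-state automaton, chosen only for illustration.
A = ({"s1", "s2", "t1", "t2"}, {"s1", "s2"},
     {("s1", "a", "t1"), ("s2", "b", "t2")})
tr, tr_c = traces(A)
```

The recursion terminates only because the automaton is assumed acyclic.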

¹We deal with only acyclic automata and so we do not need to deal with infinite traces, though all the work of this paper can 
be extended to infinite traces and cyclic automata in the standard way (T). 






Refinement for Probabilistic Systems with Nondeterminism 



We wish to model, using our automata, components that, like CSP processes, can immediately be 
nondeterministic. But, unlike CSP, we wish hiding (abstraction) to distribute through choice (so $\tau$s are 
used only for unobservable actions or for events, and not pressed into service to encode nondeterministic 
choice between starting states). There is a subtle difference between how external choice in CSP and 
choice in CCS behave with processes containing initial $\tau$ actions. This has been explained either by 
regarding the choice operators as being different, see [2]: "The unique choice operator of CCS, denoted 
by +, is a mixture between external and internal choices", or by viewing CSP's use of $\tau$ actions to model 
a nondetermined start state as different to CCS's use of $\tau$ actions [2]. By allowing automata to have a set 
of start states we both avoid having to distinguish external choice and CCS choice and allow hiding to 
distribute through choice. 

Also, choice can be defined (El [51) between FAs with one start state each by gluing the two start 
states together to make a new single start state. Here, due to our generalisation, we glue together two sets 
of start states. 

Let $S = \{s_1, s_2, \ldots, s_n\}$ and $S' = \{s'_1, s'_2, \ldots, s'_m\}$ be two sets of starting states and then define $\{S/S \times S'\}$ 
to be the $n$ substitutions $\{s_i \in S \mid s_i/\{(s_i,s'_1), \ldots, (s_i,s'_m)\}\}$ and define $\{S'/S \times S'\}$ to be the $m$ 
substitutions $\{s'_j \in S' \mid s'_j/\{(s_1,s'_j), \ldots, (s_n,s'_j)\}\}$. 

We define $\{SS'/S \times S'\}$ to be the $n + m$ simultaneous substitutions $\{S/S \times S'\} \cup \{S'/S \times S'\}$. The first 
$n$ substitutions replace each element of $\{s_1, s_2, \ldots, s_n\}$ with a set of $m$ nodes and the last $m$ substitutions 
simultaneously replace each element of $\{s'_1, s'_2, \ldots, s'_m\}$ with a set of $n$ nodes. Consequently 
$\{S_A S_B/S_A \times S_B\}$ will identify the two sets of nodes $S_A$ and $S_B$, as $S_A\{S_A S_B/S_A \times S_B\}$ and 
$S_B\{S_A S_B/S_A \times S_B\}$ are both the $n \times m$ set of nodes $S_A \times S_B$. 

Since single states may now become sets of states under the substitution, we also have to define what 
it means to have sets of nodes in a transition: 

$T \xrightarrow{a} T' \stackrel{\text{def}}{=} \{t \xrightarrow{a} t' \mid t \in T, t' \in T'\}$ 

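Definition 2 Process operators. Let $A$ be $(N_A, S_A, T_A)$ and let $B$ be $(N_B, S_B, T_B)$. 

Action Prefixing $a.B \stackrel{\text{def}}{=} (\{s\} \cup N_B, \{s\}, \{s \xrightarrow{a} x \mid x \in S_B\} \cup T_B)$ where $s$ is a new state. 

Internal choice $A \sqcap B \stackrel{\text{def}}{=} (N_A \cup N_B, S_A \cup S_B, T_A \cup T_B)$ 

External choice is, informally, internal choice where start states are combined according to the substitutions 
above. Let $S_{A \Box B}$ be $\bigcup((S_A \cup S_B)\{S_A S_B/S_A \times S_B\})$, i.e. we combine start states as above. Then, 

External choice $A \Box B \stackrel{\text{def}}{=} ((N_A \cup N_B) \setminus (S_A \cup S_B) \cup S_{A \Box B}, S_{A \Box B}, (T_A \cup T_B)\{S_A S_B/S_A \times S_B\})$ 

Parallel composition: $A \parallel_P B \stackrel{\text{def}}{=} (N_{A \parallel_P B}, S_{A \parallel_P B}, T_{A \parallel_P B})$ where $P \subseteq Act$ is the set of actions on which 
$A$ and $B$ synchronise, $N_{A \parallel_P B} = N_A \times N_B$, $S_{A \parallel_P B} = S_A \times S_B$ and $T_{A \parallel_P B}$ is defined by: 

$$\frac{n \xrightarrow{x}_A l, \quad m \xrightarrow{x}_B k, \quad x \in P}{(n,m) \xrightarrow{\tau}_{A \parallel_P B} (l,k)}$$ 

$$\frac{n \xrightarrow{x}_A l, \quad (x \notin P \wedge m \in N_B)}{(n,m) \xrightarrow{x}_{A \parallel_P B} (l,m)} \qquad \frac{n \xrightarrow{x}_B l, \quad (x \notin P \wedge m \in N_A)}{(m,n) \xrightarrow{x}_{A \parallel_P B} (m,l)}$$ 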
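Example 1 Let $A$ be 

$(\{s_1, s_2, t_1, t_2\}, \{s_1, s_2\}, \{s_1 \xrightarrow{a}_A t_1, s_2 \xrightarrow{b}_A t_2\})$ 

and let $B$ be 

$(\{s, t\}, \{s\}, \{s \xrightarrow{c}_B t\})$ 

[Diagram: $A$, with start states $s_1$ and $s_2$ and end states $t_1$ and $t_2$; $B$, with start state $s$, action $c$ and end state $t$.] 

Then $A \sqcap B$ is 

$(\{s_1, s_2, s, t_1, t_2, t\}, \{s_1, s_2, s\}, \{s_1 \xrightarrow{a}_{A \sqcap B} t_1, s_2 \xrightarrow{b}_{A \sqcap B} t_2, s \xrightarrow{c}_{A \sqcap B} t\})$ 

[Diagram: $A \sqcap B$, with start states $s_1$, $s_2$ and $s$ and end states $t_1$, $t_2$ and $t$.] 

Given that $S_{A \Box B}$ is 

$\bigcup \{s_1, s_2, s\}\{s_1/\{(s_1,s)\}, s_2/\{(s_2,s)\}, s/\{(s_1,s),(s_2,s)\}\} = \{(s_1,s),(s_2,s)\}$ 

then $A \Box B$ is 

$(\{t_1, t_2, t, (s_1,s), (s_2,s)\}, \{(s_1,s), (s_2,s)\}, \{(s_1,s) \xrightarrow{a}_{A \Box B} t_1, (s_2,s) \xrightarrow{b}_{A \Box B} t_2, \{(s_1,s),(s_2,s)\} \xrightarrow{c}_{A \Box B} t\})$ 

whose transition set, once the transition from a set of nodes is expanded, is 

$\{(s_1,s) \xrightarrow{a}_{A \Box B} t_1, (s_2,s) \xrightarrow{b}_{A \Box B} t_2, (s_1,s) \xrightarrow{c}_{A \Box B} t, (s_2,s) \xrightarrow{c}_{A \Box B} t\}$ 

[Diagram: $A \Box B$, with start states $(s_1,s)$ and $(s_2,s)$ and end states $t_1$, $t_2$ and $t$.] 

Finally, $A \parallel_{\{a\}} B$ (note that $B$'s action is now $a$): 

[Diagram: $A$ with actions $a$ and $b$; $B$ with action $a$; and $A \parallel_{\{a\}} B$ over the nodes $(s_1,s)$, $(t_1,t)$, $(s_2,s)$, $(t_2,s)$, $(s_2,t)$ and $(t_2,t)$, with $b$ transitions from $(s_2,s)$ to $(t_2,s)$ and from $(s_2,t)$ to $(t_2,t)$.] 

□ 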
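The operators used in Example 1 can be sketched in Python as follows. This is a minimal illustration under our own assumptions: automata are triples of Python sets, $\tau$ is encoded as the string "tau", a synchronised step is labelled $\tau$, and the state and action names follow our reading of the example. External choice is omitted.

```python
from itertools import product

TAU = "tau"  # assumed encoding of the special action tau

def internal_choice(A, B):
    """Component-wise union of nodes, start nodes and transitions."""
    return (A[0] | B[0], A[1] | B[1], A[2] | B[2])

def parallel(A, B, sync):
    """Parallel composition of two FA, synchronising on the actions in sync."""
    nodes = set(product(A[0], B[0]))
    starts = set(product(A[1], B[1]))
    trans = set()
    for (n, x, l) in A[2]:
        if x in sync:
            # Both sides move together; the resulting event is labelled tau.
            trans |= {((n, m), TAU, (l, k)) for (m, y, k) in B[2] if y == x}
        else:
            # A moves alone while B stays in any of its nodes.
            trans |= {((n, m), x, (l, m)) for m in B[0]}
    for (m, y, k) in B[2]:
        if y not in sync:
            # B moves alone while A stays in any of its nodes.
            trans |= {((n, m), y, (n, k)) for n in A[0]}
    return nodes, starts, trans

A = ({"s1", "s2", "t1", "t2"}, {"s1", "s2"},
     {("s1", "a", "t1"), ("s2", "b", "t2")})
B_c = ({"s", "t"}, {"s"}, {("s", "c", "t")})  # B as first given
B_a = ({"s", "t"}, {"s"}, {("s", "a", "t")})  # B with its action renamed to a

choice = internal_choice(A, B_c)
par = parallel(A, B_a, {"a"})
```

Here `par` has a single synchronised transition from `("s1", "s")` to `("t1", "t")`, plus the two interleaved `b` transitions of `A`.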
3 Testing semantics 

The definitions in this section are taken from [6] where they have been applied to both state-based and 
event-based models. 

One of our tests, of a process $E$, taken from a set of processes $\mathcal{E}$, consists of placing $E$ in some 
context $X$ taken from a set of possible contexts $\Xi$. $E$ in context $X$ is written $[E]_X$. We then observe the 
resulting system. Each observation made is taken from a set of possible observations $\mathcal{O}$. 

We turn first to our general definition of testing semantics for nondeterministic processes and con- 
texts. In this setting a test may return (nondeterministically) one observation from a set of possible 
observations. 

A specification is interpreted as a contract consisting of the assumption that the process will be 
placed only in one of the specified contexts $\Xi$ and a guarantee that the observation of its behaviour will 
be one of the observations defined by the mapping $O : \Xi \to \mathcal{E} \to \wp\mathcal{O}$. The mapping $O$ defines what can 
be observed for all processes in any of the assumed contexts. Hence for any fixed $\Xi$ and $O$ we have a 
definition of the semantics and the refinement of processes. 

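Definition 3 Let $\Xi$ be a set of contexts each of which the processes $A, C \in \mathcal{E}$ can communicate privately 
with, and let $O : \Xi \to \mathcal{E} \to \wp\mathcal{O}$ be a function which returns a set of observations, i.e. a subset of $\mathcal{O}$. Then, 
the relational semantics of a process $A$ is a subset of $\Xi \times \mathcal{O}$: 

$[\![A]\!]_{\Xi,O} \stackrel{\text{def}}{=} \{(x,o) \mid x \in \Xi \wedge o \in O([A]_x)\}$ 

and refinement is given by 

$A \sqsubseteq_{\Xi,O} C \stackrel{\text{def}}{=} [\![C]\!]_{\Xi,O} \subseteq [\![A]\!]_{\Xi,O}$ 

and equality is 

$A =_{\Xi,O} C \stackrel{\text{def}}{=} [\![A]\!]_{\Xi,O} = [\![C]\!]_{\Xi,O}$ 

□ 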
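The shape of Definition 3 can be sketched in Python as follows. This is only an illustration under our own assumptions: the functions `place` and `observe` and the toy encoding (a process is a set of traces, a context is the set of actions it will take part in) are hypothetical stand-ins for $[\cdot]_x$ and $O$, not the constructions used later in the paper.

```python
def semantics(process, contexts, place, observe):
    """Relational semantics: every (context, observation) pair the process allows."""
    return {(x, o) for x in contexts for o in observe(place(process, x))}

def refines(spec, impl, contexts, place, observe):
    """impl refines spec when each observation of impl is also one of spec."""
    return (semantics(impl, contexts, place, observe)
            <= semantics(spec, contexts, place, observe))

# Toy instance: a process is a set of traces, a context is the set of
# actions it is willing to take part in, and we observe the surviving traces.
def place(process, context):
    return frozenset(t for t in process if set(t) <= context)

def observe(system):
    return system

contexts = [frozenset({"a", "b"}), frozenset({"a"})]
spec = frozenset({(), ("a",), ("b",)})
impl = frozenset({(), ("a",)})

print(refines(spec, impl, contexts, place, observe))  # True
print(refines(impl, spec, contexts, place, observe))  # False
```

The second call fails because `spec` can show the trace `("b",)` in the first context and `impl` cannot.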



Given a rich enough class of tests the use of nondeterministic tests is redundant, as what can be ob- 
served using a nondeterministic test will be the union of what can be observed using a set of deterministic 
tests. Hence nondeterministic tests add no further information and will be ignored. 

For all the processes considered in this paper, placing a process $A$ in a context $X$, i.e. $[A]_X$, will mean 
executing process $A$ in parallel with $X$, i.e. $A \parallel_N X$ (where $N$ is some set of actions over which the context 
and process communicate, i.e. synchronize) and the observation function $O$ is either the trace function 
$Tr$ (if only safety properties are of interest) or (if liveness properties are of interest) the complete trace 
function $Tr^c$. 



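Definition 4 Let $\Xi_{FA}$ be FA and let $\sqsubseteq_{FA}$ be $\sqsubseteq_{\Xi_{FA},Tr^c}$. 

Theorem 1 Refinement distributes through parallel composition: Let $X, Y, P, Q \in$ FA 

$$\frac{X \sqsubseteq_{FA} Y, \quad P \sqsubseteq_{FA} Q}{X \parallel_N P \sqsubseteq_{FA} Y \parallel_N Q}$$ 

□ 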
4 Probabilities — Two Interpretations 

There are two (main) interpretations of probability, the frequentist and the Bayesian. 

The frequentists' definition sees probability as the long-run expected frequency of occurrence. The prob- 
ability of event A happening, where n is the number of times event A occurs in N opportunities, is 

$P(A) = n/N$. 

The Bayesians' view of probability is related to degree of belief or state of knowledge. It is a measure of 
the plausibility of an event given incomplete knowledge. The Bayesian probabilist specifies some 
given or assumed prior probabilities, which are then used in the computation of other probabilities. 
That is to say, anything that is nondeterministic or unknown must either be assigned some proba- 
bility or have its probability computed from other, more primitive, known probabilities. Bayesian 
statisticians have developed several "objective" methods for specifying prior probabilities. 

The frequentists' view is based upon repeatedly performing the same test many times and, where the 
behaviour of the item under test is nondeterministic, aggregating the results of all the tests. Extending 
an event-based testing semantics to record not just the set of possible observations but the probability 
with which they occur is a simple uniform way to extend event-based testing semantics to event-based 
probabilistic testing semantics. This can be further generalised by representing both the process under 
test and the test process itself with probabilistic automata. 

The Bayesian view fits well with Hoare's comment on nondeterminism E p81]: 

"There is nothing mysterious about this kind of nondeterminism: it arises from a deliberate 
decision to ignore the factors which influence the selection." 

So, nondeterminism in a process is merely a case of not having analysed it enough to quantify it, i.e. 
attach to it some probabilities. Nondeterministic choice is probabilistic choice with unknown probabili- 
ties. Surprisingly, this is not how testing semantics have been defined in the literature. 

As probabilities quantify (i.e. attach a number to, or make quantitative) nondeterministic behaviour, 
it is clearly crucial when modelling some real process to distinguish between the behaviour of the process 
being deterministic and the behaviour being nondeterministic. Similarly when the process is observed 
interacting in some context it is crucial to distinguish the nondeterminism of the process from the non- 
determinism of the context. 

Give a coin to a frequentist statistician and they experiment by flipping the coin a large number of 
times noting down the number of times they observe heads being uppermost and the number of times 
they observe tails. From this experiment they can compute the probability. 

An important point to note is that, to the frequentist, probabilities define how likely it is that an action 
is executed, or equivalently how likely it is that the execution ends in a particular state. The probability 
of an event occurring when the event cannot be executed must be zero. 

The Bayesian statistician, given a coin, knows that the only observations are heads and tails, and 
has no further information. The skill of the Bayesian statistician is to assign a prior probability based 
on understanding the world that agrees with the frequentist. It becomes very important when we try to 
add probabilities to event-based processes that we either follow the frequentist and perform experiments 
(tests) or follow the Bayesian statistician and think clearly about the behaviour in the world of what we 
are modelling. 
Figure 1: Probabilities on starting states 

Figure 2: More general probabilistic combination (start states $s_1 \mapsto p \cdot q$, $s_2 \mapsto (1-p) \cdot q$, $t \mapsto 1-q$) 



5 Probabilistic Finite Automata 
5.1 Probability 

We introduce probabilities on choice by attaching probabilities to the start states of a process. There are 
two things to notice here. First, as in the non-probabilistic case with FAs, we represent nondeterminism on the 
initial state of a process by allowing the process to start in one of a set of states. Second, we generalise this 
idea to represent the probability of starting in some state of a process by attaching a probability to each 
of its start states, so that we can see the probability of each possible start state actually being chosen 
for some particular execution of the process. 

The first of these points is inherited from work fS] which seeks to remove the need to use unob- 
servable actions to also "encode" or represent nondeterminism in a process by assuming the process 
makes an unobserved transition to its "real" starting state (which may be one of many) from some single 
"dummy" formal starting state. (And, of course, this is just a case of using the usual "set of states" model 
uniformly for start states as well as all other states, which is something we are all familiar with from the 
"classic" algorithm that constructs a deterministic finite-state automaton from a nondeterministic one.) 
Such unobserved actions can then be used exclusively to denote (synchronisation between) events. This 
idea is, in the second point above, carried over into the probabilistic realm so that initial probabilistic 
choice is replaced by a probability distribution over the possible starting states. 

So, if $P$ is the process that starts with a choice between $Q_1$ and $Q_2$, which have (single, for this 
illustration) starting states $s_1$ and $s_2$ respectively, with probabilities $p$ of starting in state $s_1$ and $1 - p$ of 
starting in state $s_2$, then we might picture $P$ as in the left of Figure 1. We might represent the picture by 
saying $S(P) = \{s_1 \mapsto p, s_2 \mapsto 1 - p\}$, where $S$ is a probability distribution function over start states of $P$. 

Further, if we now form the process $a.P$ (i.e. the event $a$ happens then the process $P$ happens) then 
we might picture this as in the middle of Figure 1, and here notice how the probabilities have migrated 
to the occurrences of event $a$. This picture suggests that transitions now represent the effect of an action 
on an initial state moving the system, according to some probability distribution, to the next state, when 
it synchronizes with the same action in some other process, i.e. when the two actions combine to form 
an event which takes place with the indicated probability. 

So in $a.P$, the action $a$ has the potential to move us from state $t$ to state $s_1$ with probability $p$ 
and to $s_2$ with probability $1 - p$ when synchronized to form an event which actually does take place 
with the indicated probabilities. We formalise all this by saying that the transitions of $a.P$ include 
$\{t \xrightarrow{a} d \mid d(s_1) = p \wedge d(s_2) = 1 - p\}$. An alternative picture might be as shown in the right of 
Figure 1, and here notice how the probabilities on the new start states for the new process $a.P$ have migrated 
from the old start states of $P$ and we have $S(a.P) = \{t_1 \mapsto p, t_2 \mapsto 1 - p\}$. This picture might be 
considered a useful, though perhaps more unusual, alternative way of thinking of our system in the previous 
picture. 

Note that the original form of transitions as in FAs can be recovered by using the domain of the 
probability distribution function to tell us what the relevant post-states are. 

As processes are combined together, the probabilities for the various component start states are 
combined to form the probabilities for the start states of the combination. As an example, see Figure 2, which 
shows what the resultant start-state probabilities are for $(Q_1 +_p Q_2) +_q P$, where $s_1$, $s_2$ and $t$ are the start 
states for $Q_1$, $Q_2$ and $P$ respectively. 
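The combination of start-state probabilities for $(Q_1 +_p Q_2) +_q P$ can be sketched in Python as follows. This is a minimal illustration assuming start-state distributions are dictionaries with disjoint state names; the function name `prob_choice` and the concrete values of $p$ and $q$ are ours, chosen only for the example.

```python
from fractions import Fraction

def prob_choice(p, left, right):
    """Start distribution of a probabilistic choice: weight left by p, right by 1 - p."""
    combined = {s: p * pr for s, pr in left.items()}
    combined.update({s: (1 - p) * pr for s, pr in right.items()})
    return combined

# Illustrative values only.
p, q = Fraction(1, 3), Fraction(1, 4)
Q1, Q2, P = {"s1": Fraction(1)}, {"s2": Fraction(1)}, {"t": Fraction(1)}

start = prob_choice(q, prob_choice(p, Q1, Q2), P)
# s1 gets p*q, s2 gets (1-p)*q and t gets 1-q.
```

Exact fractions are used so that the weights can be compared with $p \cdot q$, $(1-p) \cdot q$ and $1-q$ without rounding.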

5.2 Probability and nondeterminism 

From statistics, the law of large numbers tells us that nondeterministic behaviour is the same as proba- 
bilistic behaviour where the probabilistic behaviour is unknown but can be found by repeating the right 
experiment a large number of times. 

In process algebras $\tau$ actions indicate hidden, unobservable, uncontrollable actions or events (a 
special case being when two processes synchronize on some actions, which we consider to be private and 
uncontrollable). Remember Hoare's comment that we cited in Section 4. We have said above that we 
view this as agreeing with the Bayesian idea that probability indicates a lack of information. 

As probabilities refer to frequencies of executable behaviour, i.e. the probability of an event 
occurring, they naturally occur on $\tau$ actions. The intuitive relationship between nondeterminism and 
probability is widely held. For example, 

"nondeterminism represents possible choices that can be resolved in a wholly unpredictable 
way. With probabilistic constructs the resolution becomes predictable up to a point, in that 
it is quantified" [9] 

We can view this as saying that probabilistic processes contain more information than nondeter- 
ministic processes but less than deterministic processes. Consequently what can be observed in any 
single observation of a probabilistic process is the same as what can be observed of the underlying 
non-probabilistic process. But by aggregating the observations of a large number of executions we can 
compute a probability distribution or verify a previously computed probability distribution. 

As $\tau$ events are built by composing two actions that are observable (via parallel composition, i.e. 
synchronization) it would be useful to find some way to compute the probability of the executable $\tau$ 
event from the prior "probabilities" of their observable parts. This we do below in Definition 8. 

The addition of probabilities to observable actions where there is no nondeterminism has proven both 
hard to interpret and hard to formalise, especially when we want to ensure that the models have desirable 
properties. One reason, in our opinion, that this has turned out to be so hard to do is that the probabilities 
on the observable actions need, obviously, to define the behaviour of the processes not just in one context 
but in all contexts.² 



²We go no further with this point in this paper, but note that, in the non-probabilistic setting, we have considered this 
previously in [10]. 
5.3 Nondeterminism 

We represent nondeterminism not by a separate set of operators but by allowing probabilities to be 
denoted not just by real numbers in the range 0 to 1 but also by real-valued terms (in that range) that contain 
variables or parameters. This introduces the idea of a starting-state distribution which is not completely 
determined or which has undetermined aspects, and hence allows us to represent nondeterminism with 
the same machinery that we introduce for probabilities. 

This idea is motivated by the Bayesian view that the more we know about a mechanism, the more 
certain we can be about the probabilities attached to its behaviour: to talk of nondeterministic behaviour 
is merely to admit having more or less incomplete information about how something behaves, and this 
incomplete information can be represented by having parameters in the terms which denote probabilities. 
This also accords with Hoare's view that nondeterminism arises from ignoring or hiding (or, we would go 
further and say, being ignorant of) some aspects of a process. Further analysis of the mechanism would 
uncover ("unhide") more of the mechanism. This view dissolves nondeterminism; there is no such thing 
really, since it just arises from not knowing (for whatever reason) enough about the actual distribution 
of probabilities amongst actions that might be taken when a choice is presented or confronted. 

5.4 Probabilistic testing semantics 

For probabilistic tests all we need change is that the user records not just a set of observations but a 
probability distribution over a set of observations, hence $\mathcal{O} \stackrel{\text{def}}{=} Act^* \to \mathbb{R}$. 

The relational semantics of process $A$ when probability distributions are observed is a subset of 
$\Xi \times (Act^* \to \mathbb{R})$. If a process is experimented upon (frequentist perspective) and the results noted then 
what is observed will be a function $\Xi \to (Act^* \to \mathbb{R})$ and hence there is no nondeterminism and no 
possibility of refinement. 

But approaching automata from the Bayesian perspective, if we can define the processes and tests as 
prior "probabilistic" automata then we might be able to use probabilistic parallel composition to compute 
the probabilistic relational semantics of the processes. From the Bayesian point of view, the probabilities 
on actions are prior probabilities that, until the action takes part in an event by being synchronized with 
another process along the same action, do not play any role. Obviously the probability of an unexecuted 
action is prior to the probability of an execution — in particular, not until we factor in the probability 
of the synchronizing action do we know (via their product) what the probability of the executed event 
(denoted by $\tau$) will be. So, it is the Bayesian ideas that allow us to make sense of attaching probabilities 
to something that has not yet happened, and which will only be a part of what happens. 

6 Formalising probabilistic automata 

In this section we will formalise the discussion in Section 5.2. Automata that contain both 
probabilistic and nondeterministic choice (sometimes called partially probabilistic) are introduced as parameterised 
probabilistic finite automata (PPFA). Here we take what we see as the standard statistical approach and 
model nondeterministic choice as probabilistic choice with unknown probability. So our probabilities 
are no longer only real numbers but may also be real-valued terms (parameterised terms, hence the 
name) that may contain variables, the unknown probabilities. Automata where nondeterminism has been 
completely replaced by probabilistic choice are deterministic probabilistic finite automata (DPFA). 

Definition 5 Parameterised Probabilistic Finite Automata (PPFA). Let $N_A$ be a finite set of nodes. The 
parameterised probabilistic finite automaton $A$ is given by the triple $(N_A, S_A, T_A)$ where 

1. $S_A$ is a "starting distribution", i.e. a parameterised probability distribution such that $dom(S_A) \subseteq N_A$, 
where $dom(S_A)$ are the starting states of $A$ 

2. $T_A \subseteq \{(n,a,d) \mid n \in N_A \wedge a \in Act^\tau \wedge d \in D_A\}$, such that for each $n \in N_A$ and $a \in Act^\tau$ there exists 
no more than one element of $T_A$ with first component $n$ and second component $a$, and recall that 
nondeterminism is modelled by a parameter in the range of the probability distribution $d$. Finally, 
$D_A$ is a set of probability distributions over states. 

Deterministic Probabilistic Finite Automata (DPFA) are PPFA with the restrictions that: 

1. The ranges of all probability distributions are sets of real values, not sets of possibly parameterised 
terms, i.e. the elements of the ranges contain no variables; 

2. $(n,a,d) \in T_A$ implies $a \in Act$. 

□ 

Let the variables $X, Y$ be taken from some set $Var$ and $\bar{X}$ be a list of variables and $\psi$ be an 
instantiation of the variables in the list taken from the set of all such instantiations $\Psi$. We will write $A(\bar{X})$ for a 
PPFA containing variables $\bar{X}$, but where not needed the list of variables will be dropped and we will write 
$A$. We interpret the variables in $A(\bar{X})$ as being globally bound and take the usual $\alpha$-congruence of terms 
and identify PPFA that differ only by the names of variables used. Similarly we assume $\alpha$-renaming to 
prevent confusion and variable capture when composing PPFAs. 
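One way to picture parameterised probabilities and their instantiation is the following Python sketch. It is an illustration under our own assumptions: a parameterised term is encoded as a function from an instantiation (a dictionary from variable names to values) to a number, and the helper names `var`, `one_minus` and `instantiate` are hypothetical.

```python
from fractions import Fraction

def var(name):
    """A term that is just the parameter called name."""
    return lambda env: Fraction(env[name])

def one_minus(term):
    """The complementary term 1 - term."""
    return lambda env: 1 - term(env)

def instantiate(dist, env):
    """Turn a parameterised distribution into a plain one under instantiation env."""
    return {state: term(env) for state, term in dist.items()}

# A starting distribution with one unknown probability X.
start = {"s1": var("X"), "s2": one_minus(var("X"))}

plain = instantiate(start, {"X": Fraction(1, 4)})
```

Each instantiation of `X` in the range 0 to 1 gives a plain distribution; leaving `X` open is what stands for the nondeterminism.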

We write $x \xrightarrow{a}_{A,p} y$ for $(x,a,d) \in T_A \wedge d(y) = p$ and $x \xrightarrow{a}_p y$ where $A$ is obvious from context. In 
addition when we want to talk about a "complete" transition, i.e. one that has its associated final state 
distribution, we write $x \xrightarrow{a}_A d$ for $(x,a,d) \in T_A$. 

Definition 6 The probability of the computation following a path, a sequence of transitions starting from 
a start state $s$, is the product of the probability of its component transitions and the probability of starting 
in the start state $S_A(s)$. Let $\pi$ be the path $s \xrightarrow{\rho_1}_{p_1} m_1, m_1 \xrightarrow{\rho_2}_{p_2} m_2, \ldots, m_{n-1} \xrightarrow{\rho_n}_{p_n} m_n$. Then the probability 
that $\pi$ is executed is 

$d(\pi) \stackrel{\text{def}}{=} S_A(s) \times p_1 \times p_2 \times \ldots \times p_n$ 

and we say that the path $\pi$ can be observed as trace $\rho = \rho_1, \rho_2, \ldots, \rho_n$. 

The probability of observing a trace $\rho$ is the sum of all the probabilities of the computation following 
any path that can be observed as trace $\rho$: 

$d(\rho) = \sum_{\rho \in tr(\pi_i)} d(\pi_i)$ 

where $tr(\pi) \stackrel{\text{def}}{=} \{\rho \mid \pi = s \xrightarrow{\rho_1}_{p_1} m_1, m_1 \xrightarrow{\rho_2}_{p_2} m_2, \ldots, m_{n-1} \xrightarrow{\rho_n}_{p_n} m_n\}$. 

Writing $S_A \stackrel{\rho}{\Longrightarrow}_p$ informs us that $p$ is the probability of seeing the trace $\rho$ when starting in any of the 
start states in $dom(S_A)$ and following some appropriate path, i.e. $d(\rho) = p$. $S_A \stackrel{\rho}{\Longrightarrow}_p n$ means that $p$ is the 
probability of seeing the trace $\rho$ when starting in any of the start states in $dom(S_A)$ and ending in state $n$. 
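As a small worked instance of Definition 6, with numbers chosen purely for illustration (they do not come from any automaton in this paper): suppose the start probability of $s$ is $1/2$, the action $a$ takes $s$ to $m_1$ with probability $1/3$ and to $m_3$ with probability $2/3$, and $b$ then takes $m_1$ to $m_2$ with probability $1$ and $m_3$ to $m_4$ with probability $1/4$. The two paths observable as the trace $a, b$ then contribute as follows.

```latex
\begin{align*}
d(\pi_1) &= S_A(s) \times \tfrac{1}{3} \times 1 = \tfrac{1}{2} \times \tfrac{1}{3} = \tfrac{1}{6} \\
d(\pi_2) &= S_A(s) \times \tfrac{2}{3} \times \tfrac{1}{4} = \tfrac{1}{2} \times \tfrac{2}{3} \times \tfrac{1}{4} = \tfrac{1}{12} \\
d(a, b)  &= d(\pi_1) + d(\pi_2) = \tfrac{1}{6} + \tfrac{1}{12} = \tfrac{1}{4}
\end{align*}
```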

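Definition 7 The probability distribution over complete traces is 

$D^c(A) \stackrel{\text{def}}{=} \{\rho \mapsto \sum_{q \in P} q \mid P = \{q \mid n \in N_A \wedge \pi(n) = \emptyset \wedge S_A \stackrel{\rho}{\Longrightarrow}_q n\}\}$ 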
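Definition 8 Process operators 

Action Prefixing $a.B \stackrel{\text{def}}{=} (\{s_a\} \cup N_B, \{s_a \mapsto 1\}, \{s_a \xrightarrow{a} S_B\} \cup T_B)$ where $s_a$ is a new state 

Internal choice $A \sqcap B \stackrel{\text{def}}{=} (N_A \cup N_B, S_A \sqcap S_B, T_A \cup T_B)$ where 
$(S_A \sqcap S_B)(n) = X \times S_A(n)$ if $n \in dom(S_A)$ else $(1 - X) \times S_B(n)$ if $n \in dom(S_B)$, where $X$ is a fresh 
parameter, and note that now $dom(S_A \sqcap S_B) = dom(S_A) \cup dom(S_B)$. 

Probabilistic choice $A \oplus_p B \stackrel{\text{def}}{=} (N_A \cup N_B, S_A \oplus_p S_B, T_A \cup T_B)$ where $(S_A \oplus_p S_B)(n) = p \times S_A(n)$ if 
$n \in dom(S_A)$ else $(1 - p) \times S_B(n)$ if $n \in dom(S_B)$, and note that now $dom(S_A \oplus_p S_B) = dom(S_A) \cup dom(S_B)$. 

We note immediately from this that internal choice is probabilistic choice with unknown probability 
between the two choices. 

External choice $A \Box B \stackrel{\text{def}}{=} ((N_A \cup N_B) \setminus (dom(S_A) \cup dom(S_B)) \cup dom(S_{A \Box B}), S_{A \Box B}, (T_A \cup T_B)\{\{S_A S_B/S_A \times S_B\}\})$ 
where $S_{A \Box B}(n_A, n_B) = S_A(n_A) \times S_B(n_B)$ and $\{\{S_A S_B/S_A \times S_B\}\}$ now, of course, uses the domains 
of the start state distributions in order to build the substitutions over start states. 

Parallel composition: 

$A \parallel_P B \stackrel{\text{def}}{=} (N_{A \parallel_P B}, S_{A \parallel_P B}, T_{A \parallel_P B})$ 
$N_{A \parallel_P B} = N_A \times N_B$ 
$S_{A \parallel_P B}(n_A, n_B) = S_A(n_A) \times S_B(n_B)$ if $n_A \in dom(S_A) \wedge n_B \in dom(S_B)$ 
and $T_{A \parallel_P B}$ is defined by: 

$$\frac{n \xrightarrow{x}_A d_A, \quad m \xrightarrow{x}_B d_B, \quad x \in P}{(n,m) \xrightarrow{\tau}_{A \parallel_P B} d_A \times d_B}$$ 

$$\frac{n \xrightarrow{x}_A d_A, \quad (x \notin P \wedge m \in N_B)}{(n,m) \xrightarrow{x}_{A \parallel_P B} d_A \times m} \qquad \frac{n \xrightarrow{y}_B d_B, \quad (y \notin P \wedge m \in N_A)}{(m,n) \xrightarrow{y}_{A \parallel_P B} m \times d_B}$$ 

where 

$d_A \times d_B = \{(x,y) \mapsto d_A(x).d_B(y) \mid n \longrightarrow_A x \wedge m \longrightarrow_B y\}$ 

and 

$d_A \times m = \{(x,m) \mapsto d_A(x) \mid n \longrightarrow_A x\}$ 

and 

$m \times d_B = \{(m,y) \mapsto d_B(y) \mid n \longrightarrow_B y\}$ 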
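The three combinations of post-state distributions used in the parallel composition rules can be sketched in Python as follows. This is a minimal illustration assuming plain (unparameterised) distributions stored as dictionaries; the function names and the example distributions are ours.

```python
from fractions import Fraction

def sync_product(d_a, d_b):
    """d_A x d_B: joint distribution when both sides move in one synchronised event."""
    return {(x, y): pa * pb for x, pa in d_a.items() for y, pb in d_b.items()}

def left_move(d_a, m):
    """d_A x m: A moves alone while B stays in node m."""
    return {(x, m): pa for x, pa in d_a.items()}

def right_move(m, d_b):
    """m x d_B: B moves alone while A stays in node m."""
    return {(m, y): pb for y, pb in d_b.items()}

# Illustrative distributions only.
d_a = {"s1": Fraction(1, 3), "s2": Fraction(2, 3)}
d_b = {"u": Fraction(1, 2), "v": Fraction(1, 2)}

joint = sync_product(d_a, d_b)
alone = left_move(d_a, "u")
```

The probability of the synchronised event reaching a pair of post-states is the product of the two prior probabilities, which is the point made in Section 5.4.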

Example 2 Consider the PPFAs given by the expressions $a.(Q_1 +_p Q_2)$ and $a.Q_1 +_p a.Q_2$. Then, 
assuming the start states, states and transitions of $Q_1$ and $Q_2$ are given by $s_1$, $s_2$, $N_1$, $N_2$, $T_1$ and $T_2$ respectively, 
we have 

$a.Q_1 +_p a.Q_2 = (\{t_1, t_2\} \cup N_1 \cup N_2, \{t_1 \mapsto p, t_2 \mapsto 1 - p\}, \{t_1 \xrightarrow{a} d_1, t_2 \xrightarrow{a} d_2 \mid d_1(s_1) = d_2(s_2) = 1\} \cup T_1 \cup T_2)$ 

$a.(Q_1 +_p Q_2) = (\{t\} \cup N_1 \cup N_2, \{t \mapsto 1\}, \{t \xrightarrow{a} d \mid d(s_1) = p, d(s_2) = 1 - p\} \cup T_1 \cup T_2)$ 

In fact, these PPFAs are indistinguishable by testing, so they are equal (they "refine both ways") as far as 
our testing semantics goes. This result can be generalised so that probability distributions on transitions 
can always be "migrated" to the starting state distribution. 
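The claim of Example 2 can be spot-checked with a short Python sketch. It rests on several assumptions of ours: $p$ is fixed to $1/3$, $Q_1$ and $Q_2$ are taken to be one-step processes performing $b$ and $c$ respectively, automata are acyclic, and a trace counts as complete when no further transition is possible. It compares only the complete-trace distributions of the two automata run on their own, which is weaker than indistinguishability under every test.

```python
from collections import defaultdict
from fractions import Fraction

def complete_trace_distribution(start, trans):
    """start: {state: prob}; trans: {(state, action): {next_state: prob}}."""
    dist = defaultdict(Fraction)

    def walk(state, trace, prob):
        moves = [(a, d) for (n, a), d in trans.items() if n == state]
        if not moves:
            # No outgoing transition: the trace is complete.
            dist[trace] += prob
        for a, d in moves:
            for nxt, pr in d.items():
                walk(nxt, trace + (a,), prob * pr)

    for s, pr in start.items():
        walk(s, (), pr)
    return dict(dist)

p = Fraction(1, 3)  # illustrative value
one = Fraction(1)
# Hypothetical bodies for Q1 and Q2: a single b step and a single c step.
q_trans = {("s1", "b"): {"u1": one}, ("s2", "c"): {"u2": one}}

# a.Q1 +_p a.Q2: the probabilities sit on the start states t1 and t2.
lhs = complete_trace_distribution(
    {"t1": p, "t2": 1 - p},
    {("t1", "a"): {"s1": one}, ("t2", "a"): {"s2": one}, **q_trans})

# a.(Q1 +_p Q2): the probabilities sit on the a transition from t.
rhs = complete_trace_distribution(
    {"t": one},
    {("t", "a"): {"s1": p, "s2": 1 - p}, **q_trans})
```

Both calls yield the same distribution, with the trace `("a", "b")` at probability $p$ and `("a", "c")` at $1 - p$.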
6.1 Testing of probabilistic processes 

Recall from Section 3 that we said in the definition of our testing semantics for FA that we will use $[A]_X = A \parallel_N X$ 
and $O_{FA} = Tr^c$. For probabilistic FAs we need to use parallel composition from Definition 8 (as 
defined for DPFA and PPFA). The observation of a single execution of a DPFA is still a trace but what 
can be "observed" over many executions is no longer simply a set of traces but, if we also record the 
frequency of occurrence of the traces, a probability distribution over the set of traces, hence $O_{DPFA} = D^c$. 
We treat PPFA similarly and let $\Xi_{PPFA} = PPFA$ and $O_{PPFA} = D^c$, except that now the observed probability 
distributions may be parameterised. 

Definition 9 The relational semantics of an entity A(X) is (where $I$ is the set of instantiations for the
parameters in X)

$[\![A(X)]\!]_{\Xi_{PPFA},D^c} \stackrel{\mathrm{def}}{=} \{(x,o) .\ x \in \Xi_{PPFA} \wedge o \in \bigcup_{i \in I} (D^c([A(X)]_x))_i\}$

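$A(X) \sqsubseteq_{\Xi_{PPFA},D^c} C(Y) \stackrel{\mathrm{def}}{=} [\![C(Y)]\!]_{\Xi_{PPFA},D^c} \subseteq [\![A(X)]\!]_{\Xi_{PPFA},D^c}$
$A(X) =_{PPFA} C(Y) \stackrel{\mathrm{def}}{=} [\![C(Y)]\!]_{\Xi_{PPFA},D^c} = [\![A(X)]\!]_{\Xi_{PPFA},D^c}$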

Note here that we have given the meaning of PPFAs as a relation from contexts (PPFAs) to probability 
distributions: 

$[\![A(X)]\!]_{\Xi_{PPFA},D^c} \subseteq \Xi_{PPFA} \times (Act^* \rightarrow Real)$

by instantiating all the open distributions that might be observed to get plain probability distributions 
"with no unknowns". 

Let $\sqsubseteq_{PPFA} \stackrel{\mathrm{def}}{=} \sqsubseteq_{\Xi_{PPFA},D^c}$. That is, we write $\sqsubseteq_{PPFA}$ for this general definition of refinement. When
$\sqsubseteq_{PPFA}$ relates two DPFA processes it is of little interest, i.e. there are no opportunities for refinement as
there is no nondeterminism (though there are, perhaps, probabilities).

In Section 7 we will show refinement of PPFA is strongly related to refinement of an underlying FA.

6.2 Simple results from the definitions 

Theorem 2 Refinement distributes through parallel composition. Let X, Y, P and Q be arbitrary PPFAs
and let $N \subseteq Act$. Then

$\dfrac{X \sqsubseteq_{PPFA} Y,\quad P \sqsubseteq_{PPFA} Q}{X \parallel_N P \sqsubseteq_{PPFA} Y \parallel_N Q}$

For an arbitrary PPFA P(Y) we have the following theorems. 

Theorem 3 $\sqcap$ is idempotent. $P(Y) =_{PPFA} P(Y) \sqcap P(Y)$

Proof: From Definition 8 it can be seen that the graph of $P(Y) \sqcap P(Y)$ consists of two copies of the graph
of P(Y); whichever copy is selected, the behaviour is exactly that of P(Y). Hence the equality.

Theorem 4 $\oplus_p$ is idempotent. $P(Y) =_{PPFA} P(Y) \oplus_p P(Y)$

Proof: Similar to Theorem 3.

7 Relating finite automata to parameterised probabilistic finite automata

We construct $[\![\_]\!]_{PPFA}$, an embedding of FA into PPFA, and $vA_{PPFA}$, a forgetful mapping from PPFA to FA, and
then show that these mappings form a Galois connection between the refinement relations $\sqsubseteq_{PPFA}$ and
$\sqsubseteq_{FA}$.

The embedding $[\![\_]\!]_{PPFA}$ of FA in PPFA will map all nondeterministic choices in FA processes into
probabilistic choice with unknown probabilities in the PPFA processes.

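Definition 10 Semantic mappings $[\![\_]\!]_{PPFA}$ and $vA_{PPFA}$ between finite automata A and parameterised
probabilistic finite automata $A_p$ are defined so that:

$[\![(N_A, S_A, T_A)]\!]_{PPFA} = (N_{A_p}, S_{A_p}, T_{A_p})$

where

$N_{A_p} \stackrel{\mathrm{def}}{=} N_A$

and

$S_{A_p} \stackrel{\mathrm{def}}{=} \{(s, X) \mid s \in S_A \wedge X \text{ is fresh} \wedge (\sum_{n \in \mathrm{dom}(S_{A_p})} S_{A_p}(n)) = 1\}$

and

$T_{A_p} \stackrel{\mathrm{def}}{=} \{(n, a, d) \mid d = \{m \mapsto V \mid n \xrightarrow{a}_A m \wedge V \text{ is fresh}\} \wedge (\sum_{m \in \mathrm{dom}(d)} d(m)) = 1\}$

The mapping $vA_{PPFA}$ from PPFA into FA forgets all probability distributions:

$vA_{PPFA}(N_{A_p}, S_{A_p}, T_{A_p}) = (N_A, S_A, T_A)$

where

$N_A \stackrel{\mathrm{def}}{=} N_{A_p}$

and

$S_A = \mathrm{dom}(S_{A_p})$

and

$T_A = \{(n, a, m) \mid n \xrightarrow{a}_{A_p} d \wedge m \in \mathrm{dom}(d)\}$

□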
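
The forgetful mapping is simple enough to sketch directly in Python. The representation below is an assumption made for illustration: an automaton is a triple of a state set, a start-state distribution (a dictionary), and a collection of transitions `(n, a, d)` with `d` a dictionary from target states to probabilities or parameters. The function name `forget` is hypothetical.

```python
def forget(states, start, trans):
    """Forgetful mapping from a PPFA to an FA: drop every probability.

    states: set of states (kept unchanged)
    start:  dict state -> probability or parameter; only its domain is kept
    trans:  iterable of (n, a, d), with d a dict target state -> probability
            or parameter; each target in dom(d) becomes a plain FA transition
    """
    fa_start = set(start)
    fa_trans = {(n, a, m) for (n, a, d) in trans for m in d}
    return set(states), fa_start, fa_trans

# A PPFA whose a-transition from t is spanned by unknown parameters,
# written here as plain strings because their values are never inspected.
ppfa_states = {"t", "s1", "s2"}
ppfa_start = {"t": 1}
ppfa_trans = [("t", "a", {"s1": "V1", "s2": "V2"})]

fa_states, fa_start, fa_trans = forget(ppfa_states, ppfa_start, ppfa_trans)
print(sorted(fa_trans))
```

Because only the domains of the distributions are used, the mapping treats known probabilities and unknown parameters alike, which is why it undoes the embedding.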

The pair of mappings $([\![\_]\!]_{PPFA}, vA_{PPFA})$ define a vertical refinement $\sqsubseteq_{PPFA}$ as they are a Galois
connection [10]. This is the content of Theorem 7, but first some preliminary results.

Lemma 1 For any FAs X and Y

$Tr^c(X) \subseteq Tr^c(Y) \Rightarrow D^c([\![X]\!]_{PPFA}) \subseteq D^c([\![Y]\!]_{PPFA})$

Proof (Sketch) The application of $[\![\_]\!]_{PPFA}$ to an FA simply adds parameterised probabilities spanning
any nondeterministic choice. The set of all possible observation traces is $Tr^c(X)$. This is also the set of
all possible observation traces of $[\![X]\!]_{PPFA}$, but now what is "observed" is not one trace but any probability
distribution over any subset of $O(X)$ (we need to use subset because when the probability of observing a trace
is 0 it is no longer in the domain of the distribution).

Hence $d \in D^c([\![X]\!]_{PPFA}) \Leftrightarrow \mathrm{dom}(d) \subseteq Tr^c(X)$. Consequently if $d \in D^c([\![X]\!]_{PPFA})$ then $\mathrm{dom}(d) \subseteq Tr^c(X)$
and since $Tr^c(X) \subseteq Tr^c(Y)$, from the assumption of the lemma, we further have $\mathrm{dom}(d) \subseteq Tr^c(Y)$.
Then $d \in D^c([\![Y]\!]_{PPFA})$ follows from the argument above with Y in place of X. •

Theorem 5 Let X and Y be FAs, and let $N \subseteq Act$. Then,

$[\![X \parallel_N Y]\!]_{PPFA} = [\![X]\!]_{PPFA} \parallel_N [\![Y]\!]_{PPFA}$

Theorem 6 Let X and Y be PPFAs, and let $N \subseteq Act$. Then,

$vA_{PPFA}(X \parallel_N Y) = vA_{PPFA}(X) \parallel_N vA_{PPFA}(Y)$

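Definition 11 Deterministic automata.

$Det_{FA} \stackrel{\mathrm{def}}{=} \{P \mid (n \xrightarrow{a} k \wedge n \xrightarrow{a} l \Rightarrow k = l) \wedge |S_P| = 1\}$

$Det_{PPFA} \stackrel{\mathrm{def}}{=} \{P \mid (n \xrightarrow{a}_p k \wedge n \xrightarrow{a}_q l \Rightarrow k = l \wedge p = q = 1) \wedge |S_P| = 1\}$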

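Lemma 2 Results involving deterministic automata.

1. (a) $\{[\![X]\!]_{PPFA} \mid X \in Det_{FA}\} = Det_{PPFA}$ and
   (b) $\{vA_{PPFA}(Y) \mid Y \in Det_{PPFA}\} = Det_{FA}$

2. Let A and C be FAs. Then $A \sqsubseteq_{FA} C \Leftrightarrow \forall x \in Det_{FA} .\ Tr^c([A]_x) \supseteq Tr^c([C]_x)$

3. Let A and C be PPFAs. Then $A \sqsubseteq_{PPFA} C \Leftrightarrow \forall x \in Det_{PPFA} .\ D^c([A]_x) \supseteq D^c([C]_x)$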

Proof (Sketch).

1(a) and 1(b) follow from definitions.

Re 2: With non-probabilistic processes and tests, what can be observed when applying a nondeterministic
test is the union of what can be observed when applying each element of the set of deterministic
alternatives (where here we picture, as usual, a nondeterministic computation as a set of deterministic
ones which covers all the possible choices) and hence:

$A \sqsubseteq_{FA} C \Leftrightarrow \forall x \in Det_{FA} .\ Tr^c([A]_x) \supseteq Tr^c([C]_x)$

Re 3: With probabilistic processes and tests, what can be observed when applying a probabilistic
test is the distribution, inferred from the test, of what can be observed when applying the deterministic
components that the probabilistic choice spans. Hence a set of test processes for PPFA that is sufficient
to establish refinement is the image after applying $[\![\_]\!]_{PPFA}$ to a sufficient set of FA processes, i.e. since
$Det_{FA}$ is sufficient for FA then $Det_{PPFA}$ is sufficient for PPFA, hence:

$A \sqsubseteq_{PPFA} C \Leftrightarrow \forall x \in Det_{PPFA} .\ D^c([A]_x) \supseteq D^c([C]_x)$

Theorem 7

$\forall X \in FA, Y \in PPFA .\ [\![X]\!]_{PPFA} \sqsubseteq_{PPFA} Y \Leftrightarrow X \sqsubseteq_{FA} vA_{PPFA}(Y)$

Proof: (Sketch)

It is a well-known result (e.g. ifTTI ) that to prove a Galois connection it is sufficient to prove for
arbitrary X

$vA_{PPFA}([\![X]\!]_{PPFA}) \sqsupseteq_{FA} id_{FA}(X)$

and for arbitrary Y

$[\![vA_{PPFA}(Y)]\!]_{PPFA} \sqsubseteq_{PPFA} id_{PPFA}(Y)$

and in addition to prove both relations $[\![\_]\!]_{PPFA}$ and $vA_{PPFA}$ are monotone.

We can see directly from the definitions that $[\![\_]\!]_{PPFA}$ adds parameterised probabilities to any
nondeterministic choice and $vA_{PPFA}$ forgets this addition; hence, for arbitrary X:

$vA_{PPFA}([\![X]\!]_{PPFA}) = X$

which gives our first inequality.

The effect of $[\![vA_{PPFA}(Y)]\!]_{PPFA}$ is to first replace probabilistic choice with nondeterministic choice (by
ignoring probabilities) and then to reintroduce probabilities-with-parameters due to the nondeterminism,
and this can be refined, along with other possibilities, back into its original value, which gives our second
inequality.

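Re: show $[\![\_]\!]_{PPFA}$ is monotone: $A \sqsubseteq_{FA} C \Rightarrow [\![A]\!]_{PPFA} \sqsubseteq_{PPFA} [\![C]\!]_{PPFA}$

From Definition 3 we have $A \sqsubseteq_{FA} C \Leftrightarrow \forall x \in \Xi_{FA} .\ Tr^c([A]_x) \supseteq Tr^c([C]_x)$ and as $Det_{FA} \subseteq \Xi_{FA}$ we also
have

$A \sqsubseteq_{FA} C \Rightarrow \forall x \in Det_{FA} .\ Tr^c([A]_x) \supseteq Tr^c([C]_x)$ (1)

From Lemma 1 we then have

$A \sqsubseteq_{FA} C \Rightarrow \forall x \in Det_{FA} .\ D^c([\![[A]_x]\!]_{PPFA}) \supseteq D^c([\![[C]_x]\!]_{PPFA})$.

Then,

$\forall x \in Det_{FA} .\ D^c([\![[A]_x]\!]_{PPFA}) \supseteq D^c([\![[C]_x]\!]_{PPFA})$

$\forall x \in Det_{FA} .\ D^c([[\![A]\!]_{PPFA}]_{[\![x]\!]_{PPFA}}) \supseteq D^c([[\![C]\!]_{PPFA}]_{[\![x]\!]_{PPFA}})$ from Theorem 5

$\forall x \in Det_{PPFA} .\ D^c([[\![A]\!]_{PPFA}]_x) \supseteq D^c([[\![C]\!]_{PPFA}]_x)$ from Lemma 2 part 1(a)

$[\![A]\!]_{PPFA} \sqsubseteq_{PPFA} [\![C]\!]_{PPFA}$ from Definition 9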

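Re: show $vA_{PPFA}$ is monotone: $A \sqsubseteq_{PPFA} C \Rightarrow vA_{PPFA}(A) \sqsubseteq_{FA} vA_{PPFA}(C)$

From $A \sqsubseteq_{PPFA} C$ and definitions we have: $\forall x \in \Xi_{PPFA} .\ D^c([A]_x) \supseteq D^c([C]_x)$;
as $Det_{PPFA} \subseteq \Xi_{PPFA}$ we have

$\forall x \in Det_{PPFA} .\ D^c([A]_x) \supseteq D^c([C]_x)$ (2)

For all $o$ in $Tr^c(vA_{PPFA}([C]_x))$ there must exist a $d$ in $D^c([C]_x)$ such that $o \in \mathrm{dom}(d)$, and from (2) we
know that $d$ is in $D^c([A]_x)$, and with $o \in \mathrm{dom}(d)$ we can conclude that $o$ is in $Tr^c(vA_{PPFA}([A]_x))$, so:

$\forall x \in Det_{PPFA} .\ Tr^c(vA_{PPFA}([A]_x)) \supseteq Tr^c(vA_{PPFA}([C]_x))$

$\forall x \in Det_{PPFA} .\ Tr^c([vA_{PPFA}(A)]_{vA_{PPFA}(x)}) \supseteq Tr^c([vA_{PPFA}(C)]_{vA_{PPFA}(x)})$ from Theorem 6

$\forall x \in Det_{FA} .\ Tr^c([vA_{PPFA}(A)]_x) \supseteq Tr^c([vA_{PPFA}(C)]_x)$ from Lemma 2 part 1(b)

$\forall x \in \Xi_{FA} .\ Tr^c([vA_{PPFA}(A)]_x) \supseteq Tr^c([vA_{PPFA}(C)]_x)$ from Lemma 2 part 3

$vA_{PPFA}(A) \sqsubseteq_{FA} vA_{PPFA}(C)$ from Definition 3

•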

The embedding $[\![\_]\!]_{PPFA}$ can be used to add probability to a non-probabilistic finite automaton during
the stepwise development, i.e. refinement, of a model or specification. This use of Galois connections is
nothing new, but to the best of our knowledge it is the first time it has been used to allow the introduction
of probability part of the way through the development of a process.



8 Conclusions 

Others have used the same testing framework to treat probabilistic processes, but in one notable case [9]
it was found that many of the expected algebraic results were false according to the testing used. This
meant the abandonment of testing as a basis for refinement, and a notion of simulation was introduced instead.
We believe that the reason that many of the "sanity checks" turned out to be false for the testing-based
refinement in that paper was that the original formalisation of nondeterminism found in non-probabilistic
systems was kept and that this led to problems when probabilistic tests on nondeterministic probabilistic
systems were considered.

Instead of abandoning refinement based on testing, we handle nondeterminism in a way that is com- 
patible with probability, rather than using the original formalisations of testing nondeterminism found in 
non-probabilistic systems. 

We also note that, having shown we can (as a (vertical) refinement) move from non-probabilistic 
models to probabilistic ones (and back again, if we wish), the introduction of probabilities can happen as 
a design step during development of a system via refinement steps. So, we are free to take a very general 
non-probabilistic specification and, if it turns out to be necessary to do so to deal with some aspects of the 
specification, introduce probabilities as we make progress towards a more concrete form of the system. 
We have not yet explored this possibility, but it does introduce another freedom to the developer which 
might turn out to be useful. 

The framework we have introduced in this paper is really only a first step towards a sensible language 
for specifying systems containing probability. What still needs to be done is to recognise that some sorts 
of probabilistic choice do not "make sense", i.e. that there are right and wrong places to use such choice. 
For example, if we have a vending machine with two buttons on, one for tea and one for coffee, it clearly 
does not make sense to specify the choice here as a probabilistic one — the vending machine would be a 
very odd one if it allowed me to choose tea only 75% of the time! 

On the other hand, it does make sense (though perhaps inventing plausible uses for such a thing might 
be hard!) to specify a robot which can make choices from a vending machine that offers tea or coffee, 
where the robot prefers tea over coffee, so it chooses tea 75% of the time. 

The difference between these two cases is one of causality. The robot's actions cause the vending 
machine's, and not vice versa. So, our specification language would need to allow us to make this 
distinction and, most helpfully, only allow probabilistic choice to be specified in situations where it makes 
sense, as in the case of specifying the robot. We have done previous work on adding causality (back) into 
process algebras, and the work presented here forms the basis for a probabilistic causal process algebra 
(CPA) lHJI, or for a probabilistic language for interactive branching processes (IBPs) lITOl which we have 
also talked about before, which forms the subject of another paper yet to be published. 

A final interesting point to note is that, because we can always migrate probabilities on actions right
up to the probabilities on start states, we have a normal form for our automata. In this form, the only place
that probabilities appear is on the start states (so the only non-trivial probability distribution over states
is the start-state distribution). This makes it very clear that one needs only one roll (of a die with enough
faces) in order to conduct a probabilistic computation.